FIG. 5

Map of pCRII-TOPO-IPAAA44548

Molecule:    pCRII-TOPO-IPAAA44548, 4214 bps DNA Circular
File Name:   13124.cm5
Description: Plasmid ID 13124

Molecule Features:

Type     Start   End         Name         Description
MARKER   239                 SP6
REGION   337     600                      IPAAA44548 cloned sequence
GENE     577     341     C   44548 cds
MARKER   670             C   T7
REGION   854     1268        f1 ori
GENE     1602    2396        KanR
GENE     2414    3274        AmpR
REGION   3419    4092        pUC ori



FIG. 6

Map of expression vector pEAK12d

Molecule:    pEAK12d, 8760 bps DNA Circular
File Name:   pEAK12DEST.cm5
Description: Mammalian cell expression vector (plasmid ID 11345)

Molecule Features:

Type     Start   End         Name         Description
REGION   2       595                      pmb-ori
GENE     596     1519        Amp
REGION   1690    2795        EF-1alpha
REGION   2703    2722                     position of pEAK12F
REGION   2796    2845                     MCS
MARKER   2855                attR1
GENE     3256    3915        CmR
GENE     4257    4562        ccdB
MARKER   4603            C   attR2
REGION   4733    4733                     MCS
REGION   4734    5162                     poly A/splice
REGION   4819    4848    C                position of pEAK12R
GENE     5781    5163    C   PUR          PUROMYCIN
REGION   6005    5782    C   tK           tK promoter
REGION   6500    6006    C   Ori P
GENE     8552    6500    C   EBNA-1
REGION   8553    8752        sv40





FIG. 7

Map of plasmid pDONR201

Molecule:    pDONR201, 4470 bps DNA Circular
File Name:   pDONR201.cm5, dated 17 Oct 2002
Description: Gateway entry vector (Invitrogen) - plasmid ID# 13309

Molecule Features:

Type     Start   End     Name
REGION   332     563     attP1
GENE     959     1264    ccdB
REGION   2513    2744    attP2
GENE     2868    3677    KanR
REGION   3794    4467    pUC ori



FIG. 8

Map of expression vector pEAK12d-IPAAA44548-6HIS

Molecule:    pEAK12d-IPAAA44548-6HIS, 7201 bps DNA Circular
File Name:   11775.cm5
Description: Mammalian cell Expression Construct

Molecule Features:

Type     Start   End         Name              Description
REGION   2       595                           pmb-ori
GENE     596     1519        Amp
REGION   1690    2795        EF-1a
REGION   2703    2722                          peak12D-F primer
MARKER   2855                attB1
GENE     2888    3139        IPAAA44548-6HIS
MARKER   3155                attB2
REGION   3175    3603        •A                poly A/splice
REGION   3289    3270    C                     pEAK12D-R primer
GENE     4222    3604    C                     PUROMYCIN
REGION   4446    4223    C   tK                tK promoter
REGION   4941    4447    C   Ori P
GENE     6993    4941    C   EBNA-1
REGION   6994    7193        sv40





FIG. 9

Map of E. coli expression vector pDEST14

Molecule:    pDEST14, 6422 bps DNA Circular
File Name:   pDEST14.cm5, dated 17 Oct 2002
Description: E. coli expression vector (Invitrogen)
Notes:       Gateway compatible. Expression under control of T7 promoter

Molecule Features:

Type     Start   End         Name          Description
MARKER   21                  T7            Promoter
REGION   67      191         attR1
GENE     441     1100        CmR
GENE     1442    1747        ccdB
REGION   1788    1912        attR2
REGION   1964    1944    C                 pDEST14 R primer
GENE     2638    3498        AmpR
REGION   3643    4316        pBR322 ori





FIG. 10

Map of plasmid pDEST14-IPAAA44548-6HIS

Molecule:    pDEST14-IPAAA44548-6HIS, 4899 bps DNA Circular
File Name:   12896.cm5
Description: plasmid ID 12896

Molecule Features:

Type     Start   End         Name              Description
MARKER   21                  T7
REGION   72      67      C   attB1
REGION   94      108                           Shine Dalgarno Sequence
GENE     109     360         IPAAA44548-6HIS
REGION   376     389         attB2
REGION   441     421     C                     pDEST14-R primer
GENE     1115    1975        Amp
REGION   2124    2763        ori               pBR322 ori
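The feature tables in FIGS. 5 to 10 give 1-based Start and End positions, and the rows marked C are listed with Start greater than End, which I take to mean that the feature lies on the complementary strand. The following is a minimal Python sketch under that assumption; the `Feature` class and its field names are illustrative and are not part of the map file format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One row of a 'Molecule Features' table (hypothetical representation)."""
    kind: str
    start: int  # 1-based, as printed in the table
    end: int    # 1-based, inclusive
    name: str

    @property
    def on_complement(self) -> bool:
        # Assumption: Start > End marks a complementary-strand ("C") feature.
        return self.start > self.end

    @property
    def length(self) -> int:
        return abs(self.end - self.start) + 1


# Three rows taken from the FIG. 10 table.
features = [
    Feature("GENE", 109, 360, "IPAAA44548-6HIS"),
    Feature("REGION", 441, 421, "pDEST14-R primer"),
    Feature("GENE", 1115, 1975, "Amp"),
]

for f in features:
    strand = "C" if f.on_complement else "+"
    print(f"{f.name}: {f.length} nt, strand {strand}")
```

The 252 nt computed for the IPAAA44548-6HIS gene correspond to 84 codons, which would fit the 78-letter query of FIG. 14 followed by six histidine codons.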



FIG. 11 pCRII-TOPO-IPAAA44548

1 AGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC 

61 ACGACAGGTT TCCCGACTGG AAAGCGGGCA GTGAGCGCAA CGCAATTAAT GTGAGTTAGC 

121 TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA 

181 TTGTGAGCGG ATAACAATTT CACACAGGAA ACAGCTATGA CCATGATTAC GCCAAGCTAT 

241 TTAGGTGACA CTATAGAATA CTCAAGCTAT GCATCAAGCT TGGTACCGAG CTCGGATCCA 

301 CTAGTAACGG CCGCCAGTGT GCTGGAATTC GCCCTTCATT CTAAAGTGTG CCATCTGCAT

361 TTCTCAACTC CAGAATTTCT GCTTGATTCC CTTTAATTAT TTCAATCTGT TTCTTATATT 

421 TGTCTGATAG AATTCTGGAT TCCTTCTCTG TGTTATCTTG AATTTCCTTG AGGTTCTTCA 

481 ACACAGATAT TTTGAATTCT GTGTCTGAAA GGTCACATAT CTCTGTTTCT CCAGGATTGG

541 TCCATGGCAG CTTATTTAGT TCGTTTGGTG AAGTCATGTT TTACTGGATG TTGTTGATGC

601 AAGGGCGAAT TCTGCAGATA TCCATCACAC TGGCGGCCGC TCGAGCATGC ATCTAGAGGG 

661 CCCAATTCGC CCTATAGTGA GTCGTATTAC AATTCACTGG CCGTCGTTTT ACAACGTCGT

721 GACTGGGAAA ACCCTGGCGT TACCCAACTT AATCGCCTTG CAGCACATCC CCCTTTCGCC 

781 AGCTGGCGTA ATAGCGAAGA GGCCCGCACC GATCGCCCTT CCCAACAGTT GCGCAGCCTG 

841 AATGGCGAAT GGGACGCGCC CTGTAGCGGC GCATTAAGCG CGGCGGGTGT GGTGGTTACG

901 CGCAGCGTGA CCGCTACACT TGCCAGCGCC CTAGCGCCCG CTCCTTTCGC TTTCTTCCCT

961 TCCTTTCTCG CCACGTTCGC CGGCTTTCCC CGTCAAGCTC TAAATCGGGG GCTCCCTTTA 

1021 GGGTTCCGAT TTAGAGCTTT ACGGCACCTC GACCGCAAAA AACTTGATTT GGGTGATGGT 

1081 TCACGTAGTG GGCCATCGCC CTGATAGACG GTTTTTCGCC CTTTGACGTT GGAGTCCACG 

1141 TTCTTTAATA GTGGACTCTT GTTCCAAACT GGAACAACAC TCAACCCTAT CGCGGTCTAT 

1201 TCTTTTGATT TATAAGGGAT TTTGCCGATT TCGGCCTATT GGTTAAAAAA TGAGCTGATT 

1261 TAACAAATTC AGGGCGCAAG GGCTGCTAAA GGAACCGGAA CACGTAGAAA GCCAGTCCGC 

1321 AGAAACGGTG CTGACCCCGG ATGAATGTCA GCTACTGGGC TATCTGGACA AGGGAAAACG 

1381 CAAGCGCAAA GAGAAAGCAG GTAGCTTGCA GTGGGCTTAC ATGGCGATAG CTAGACTGGG 

1441 CGGTTTTATG GACAGCAAGC GAACCGGAAT TGCCAGCTGG GGCGCCCTCT GGTAAGGTTG 



FIG. 11(contd.)

1501 GGAAGCCCTG CAAAGTAAAC TGGATGGCTT TCTTGCCGCC AAGGATCTGA TGGCGCAGGG

1561 GATCAAGATC TGATCAAGAG ACAGGATGAG GATCGTTTCG CATGATTGAA CAAGATGGAT

1621 TGCACGCAGG TTCTCCGGCC GCTTGGGTGG AGAGGCTATT CGGCTATGAC TGGGCACAAC

1681 AGACAATCGG CTGCTCTGAT GCCGCCGTGT TCCGGCTGTC AGCGCAGGGG CGCCCGGTTC

1741 TTTTTGTCAA GACCGACCTG TCCGGTGCCC TGAATGAACT GCAGGACGAG GCAGCGCGGC

1801 TATCGTGGCT GGCCACGACG GGCGTTCCTT GCGCAGCTGT GCTCGACGTT GTCACTGAAG

1861 CGGGAAGGGA CTGGCTGCTA TTGGGCGAAG TGCCGGGGCA GGATCTCCTG TCATCTCGCC

1921 TTGCTCCTGC CGAGAAAGTA TCCATCATGG CTGATGCAAT GCGGCGGCTG CATACGCTTG

1981 ATCCGGCTAC CTGCCCATTC GACCACCAAG CGAAACATCG CATCGAGCGA GCACGTACTC

2041 GGATGGAAGC CGGTCTTGTC GATCAGGATG ATCTGGACGA AGAGCATCAG GGGCTCGCGC

2101 CAGCCGAACT GTTCGCCAGG CTCAAGGCGC GCATGCCCGA CGGCGAGGAT CTCGTCGTGA

2161 TCCATGGCGA TGCCTGCTTG CCGAATATCA TGGTGGAAAA TGGCCGCTTT TCTGGATTCA

2221 ACGACTGTGG CCGGCTGGGT GTGGCGGACC GCTATCAGGA CATAGCGTTG GATACCCGTG

2281 ATATTGCTGA AGAGCTTGGC GGCGAATGGG CTGACCGCTT CCTCGTGCTT TACGGTATCG

2341 CCGCTCCCGA TTCGCAGCGC ATCGCCTTCT ATCGCCTTCT TGACGAGTTC TTCTGAATTG

2401 AAAAAGGAAG AGTATGAGTA TTCAACATTT CCGTGTCGCC CTTATTCCCT TTTTTGCGGC

2461 ATTTTGCCTT CCTGTTTTTG CTCACCCAGA AACGCTGGTG AAAGTAAAAG ATGCTGAAGA

2521 TCAGTTGGGT GCACGAGTGG GTTACATCGA ACTGGATCTC AACAGCGGTA AGATCCTTGA

2581 GAGTTTTCGC CCCGAAGAAC GTTTTCCAAT GATGAGCACT TTTAAAGTTC TGCTATGTGA

2641 TACACTATTA TCCCGTATTG ACGCCGGGCA AGAGCAACTC GGTCGCCGCA TACACTATTC

2701 TCAGAATGAC TTGGTTGAGT ACTCACCAGT CACAGAAAAG CATCTTACGG ATGGCATGAC

2761 AGTAAGAGAA TTATGCAGTG CTGCCATAAC CATGAGTGAT AACACTGCGG CCAACTTACT

2821 TCTGACAACG ATCGGAGGAC CGAAGGAGCT AACCGCTTTT TTGCACAACA TGGGGGATCA

2881 TGTAACTCGC CTTGATCGTT GGGAACCGGA GCTGAATGAA GCCATACCAA ACGACGAGAG

2941 TGACACCACG ATGCCTGTAG CAATGCCAAC AACGTTGCGC AAACTATTAA CTGGCGAACT

3001 ACTTACTCTA GCTTCCCGGC AACAATTAAT AGACTGAATG GAGGCGGATA AAGTTGCAGG

3061 ACCACTTCTG CGCTCGGCCC TTCCGGCTGG CTGGTTTATT GCTGATAAAT CTGGAGCCGG



3121 TGAGCGTGGG TCTCGCGGTA TCATTGCAGC ACTGGGGCCA GATGGTAAGC GCTCCCGTAT 

3181 CGTAGTTATC TACACGACGG GGAGTCAGGC AACTATGGAT GAACGAAATA GACAGATCGC 

3241 TGAGATAGGT GCCTCACTGA TTAAGCATTG GTAACTGTCA GACCAAGTTT ACTCATATAT 

3301 ACTTTAGATT GATTTAAAAC TTCATTTTTA ATTTAAAAGG ATCTAGGTGA AGATCCTTTT 

3361 TGATAATCTC ATGACCAAAA TCCCTTAACG TGAGTTTTCG TTCCACTGAG CGTCAGACCC 

3421 CGTAGAAAAG ATCAAAGGAT CTTCTTGAGA TCCTTTTTTT CTGCGCGTAA TCTGCTGCTT 

3481 GCAAACAAAA AAACCACCGC TACCAGCGGT GGTTTGTTTG CCGGATCAAG AGCTACCAAC 

3541 TCTTTTTCCG AAGGTAACTG GCTTCAGCAG AGCGCAGATA CCAAATACTG TCCTTCTAGT 

3601 GTAGCCGTAG TTAGGCCACC ACTTCAAGAA CTCTGTAGCA CCGCCTACAT ACCTCGCTCT 

3661 GCTAATCCTG TTACCAGTGG CTGCTGCCAG TGGCGATAAG TCGTGTCTTA CCGGGTTGGA

3721 CTCAAGACGA TAGTTACCGG ATAAGGCGCA GCGGTCGGGC TGAACGGGGG GTTCGTGCAC 

3781 ACAGCCCAGC TTGGAGCGAA CGACCTACAC CGAACTGAGA TACCTACAGC GTGAGCTATG 

3841 AGAAAGCGCC ACGCTTCCCG AAGGGAGAAA GGCGGACAGG TATCCGGTAA GCGGCAGGGT 

3901 CGGAACAGGA GAGCGCACGA GGGAGCTTCC AGGGGGAAAC GCCTGGTATC TTTATAGTCC 

3961 TGTCGGGTTT CGCCACCTCT GACTTGAGCG TCGATTTTTG TGATGCTCGT CAGGGGGGCG 

4021 GAGCCTATGG AAAAACGCCA GCAACGCGGC CTTTTTACGG TTCCTGGGCT TTTGCTGGCC 

4081 TTTTGCTCAC ATGTTCTTTC CTGCGTTATC CCCTGATTCT GTGGATAACC GTATTACCGC

4141 CTTTGAGTGA GCTGATACCG CTCGCCGCAG CCGAACGACC GAGCGCAGCG AGTCAGTGAG 

4201 CGAGGAAGCG GAAG
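Each row of the sequence listings starts with the 1-based position of its first base, followed by blocks of ten bases. As a hedged illustration (the `parse_row` and `last_position` helpers are my own names, not something defined in the source), the final row of FIG. 11 can be checked against the 4214 bps stated for this plasmid in FIG. 5:

```python
def parse_row(row: str) -> tuple[int, str]:
    """Split a listing row into (start position, bases without spaces)."""
    position, *blocks = row.split()
    return int(position), "".join(blocks)


def last_position(row: str) -> int:
    """Return the 1-based position of the final base in the row."""
    start, bases = parse_row(row)
    return start + len(bases) - 1


# Final row of FIG. 11 as I read it from the listing.
final_row = "4201 CGAGGAAGCG GAAG"
print(last_position(final_row))  # 4214
```

The same check applied to the final row of FIG. 12 (starting at 4881 with 19 bases) gives 4899, which agrees with the size stated in FIG. 10.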



FIG. 11(contd.) 



FIG. 12 pDEST14-IPAAA44548-6HIS

1 AGATCTCGAT CCCGCGAAAT TAATACGACT CACTATAGGG AGACCACAAC GGTTTCCCTC TAGATCACAA GTTTGTACAA 

81 AAAAGCAGGC TTCGAAGGAG ATATACATAT GACTTCACCA AACGAACTAA ATAAGCTGCC ATGGACCAAT CCTGGAGAAA 

161 CAGAGATATG TGACCTTTCA GACACAGAAT TCAAAATATC TGTGTTGAAG AACCTCAAGG AAATTCAAGA TAACACAGAG 

241 AAGGAATCCA GAATTCTATC AGACAAATAT AAGAAACAGA TTGAAATAAT TAAAGGGAAT CAAGCAGAAA TTCTGGAGTT 

321 GAGAAATGCA GATGGCACAC TTCACCATCA CCATCACCAT TGAAACCCAG CTTTCTTGTA CAAAGTGGTG ATGATCCGGC

401 TGCTAACAAA GCCCGAAAGG AAGCTGAGTT GGCTGCTGCC ACCGCTGAGC AATAACTAGC ATAACCCCTT GGGGCCTCTA 

481 AACGGGTCTT GAGGGGTTTT TTGCTGAAAG GAGGAACTAT ATCCGGATAT CCACAGGACG GGTGTGGTCG CCATGATCGC 

561 GTAGTCGATA GTGGCTCCAA GTAGCGAAGC GAGCAGGACT GGGCGGCGGC CAAAGCGGTC GGACAGTGCT CCGAGAACGG 

641 GTGCGCATAG AAATTGCATC AACGCATATA GCGCTAGCAG CACGCCATAG TGACTGGCGA TGCTGTCGGA ATGGACGATA 

721 TCCCGCAAGA GGCCCGGCAG TACCGGCATA ACCAAGCCTA TGCCTACAGC ATCCAGGGTG ACGGTGCCGA GGATGACGAT 

801 GAGCGCATTG TTAGATTTCA TACACGGTGC CTGACTGCGT TAGCAATTTA ACTGTGATAA ACTACCGCAT TAAAGCTTAT 

881 CGATGATAAG CTGTCAAACA TGAGAATTCT TGAAGACGAA AGGGCCTCGT GATACGCCTA TTTTTATAGG TTAATGTCAT

961 GATAATAATG GTTTCTTAGA CGTCAGGTGG CACTTTTCGG GGAAATGTGC GCGGAACCCC TATTTGTTTA TTTTTCTAAA 

1041 TACATTCAAA TATGTATCCG CTCATGAGAC AATAACCCTG ATAAATGCTT CAATAATATT GAAAAAGGAA GAGTATGAGT 

1121 ATTCAACATT TCCGTGTCGC CCTTATTCCC TTTTTTGCGG CATTTTGCCT TCCTGTTTTT GCTCACCCAG AAACGCTGGT 

1201 GAAAGTAAAA GATGCTGAAG ATCAGTTGGG TGCACGAGTG GGTTACATCG AACTGGATCT CAACAGCGGT AAGATCCTTG 

1281 AGAGTTTTCG CCCCGAAGAA CGTTTTCCAA TGATGAGCAC TTTTAAAGTT CTGCTATGTG GCGCGGTATT ATCCCGTGTT 

1361 GACGCCGGGC AAGAGCAACT CGGTCGCCGC ATACACTATT CTCAGAATGA CTTGGTTGAG TACTCACCAG TCACAGAAAA 

1441 GCATCTTACG GATGGCATGA CAGTAAGAGA ATTATGCAGT GCTGCCATAA CCATGAGTGA TAACACTGCG GCCAACTTAC 

1521 TTCTGACAAC GATCGGAGGA CCGAAGGAGC TAACCGCTTT TTTGCACAAC ATGGGGGATC ATGTAACTCG CCTTGATCGT

1601 TGGGAACCGG AGCTGAATGA AGCCATACCA AACGACGAGC GTGACACCAC GATGCCTGCA GCAATGGCAA CAACGTTGCG 

1681 CAAACTATTA ACTGGCGAAC TACTTACTCT AGCTTCCCGG CAACAATTAA TAGACTGGAT GGAGGCGGAT AAAGTTGCAG 

1761 GACCACTTCT GCGCTCGGCC CTTCCGGCTG GCTGGTTTAT TGCTGATAAA TCTGGAGCCG GTGAGCGTGG GTCTCGCGGT 

1841 ATCATTGCAG CACTGGGGCC AGATGGTAAG CCCTCCCGTA TCGTAGTTAT CTACACGACG GGGAGTCAGG CAACTATGGA 

1921 TGAACGAAAT AGACAGATCG CTGAGATAGG TGCCTCACTG ATTAAGCATT GGTAACTGTC AGACCAAGTT TACTCATATA

2001 TACTTTAGAT TGATTTAAAA CTTCATTTTT AATTTAAAAG GATCTAGGTG AAGATCCTTT TTGATAATCT CATGACCAAA

2081 ATCCCTTAAC GTGAGTTTTC GTTCCACTGA GCGTCAGACC CCGTAGAAAA GATCAAAGGA TCTTCTTGAG ATCCTTTTTT 

2161 TCTGCGCGTA ATCTGCTGCT TGCAAACAAA AAAACCACCG CTACCAGCGG TGGTTTGTTT GCCGGATCAA GAGCTACCAA 

2241 CTCTTTTTCC GAAGGTAACT GGCTTCAGCA GAGCGCAGAT ACCAAATACT GTCCTTCTAG TGTAGCCGTA GTTAGGCCAC 

2321 CACTTCAAGA ACTCTGTAGC ACCGCCTACA TACCTCGCTC TGCTAATCCT GTTACCAGTG GCTGCTGCCA GTGGCGATAA 



2401 GTCGTGTCTT ACCGGGTTGG ACTCAAGACG ATAGTTACCG GATAAGGCGC AGCGGTCGGG CTGAACGGGG GGTTCGTGCA 

2481 CACAGCCCAG CTTGGAGCGA ACGACCTACA CCGAACTGAG ATACCTACAG CGTGAGCTAT GAGAAAGCGC CACGCTTCCC

2561 GAAGGGAGAA AGGCGGACAG GTATCCGGTA AGCGGCAGGG TCGGAACAGG AGAGCGCACG AGGGAGCTTC CAGGGGGAAA

2641 CGCCTGGTAT CTTTATAGTC CTGTCGGGTT TCGCCACCTC TGACTTGAGC GTCGATTTTT GTGATGCTCG TCAGGGGGGC 

2721 GGAGCCTATG GAAAAACGCC AGCAACGCGG CCTTTTTACG GTTCCTGGCC TTTTGCTGGC CTTTTGCTCA CATGTTCTTT 

2801 CCTGCGTTAT CCCCTGATTC TGTGGATAAC CGTATTACCG CCTTTGAGTG AGCTGATACC GCTCGCCGCA GCCGAACGAC

2881 CGAGCGCAGC GAGTCAGTGA GCGAGGAAGC GGAAGAGCGC CTGATGCGGT ATTTTCTCCT TACGCATCTG TGCGGTATTT 

2961 CACACCGCAT ATATGGTGCA CTCTCAGTAC AATCTGCTCT GATGCCGCAT AGTTAAGCCA GTATACACTC CGCTATCGCT 

3041 ACGTGACTGG GTCATGGCTG CGCCCCGACA CCCGCCAACA CCCGCTGACG CGCCCTGACG GGCTTGTCTG CTCCCGGCAT 

3121 CCGCTTACAG ACAAGCTGTG ACCGTCTCCG GGAGCTGCAT GTGTCAGAGG TTTTCACCGT CATCACCGAA ACGCGCGAGG 

3201 CAGCTGCGGT AAAGCTCATC AGCGTGGTCG TGAAGCGATT CACAGATGTC TGCCTGTTCA TCCGCGTCCA GCTCGTTGAG

3281 TTTCTCCAGA AGCGTTAATG TCTGGCTTCT GATAAAGCGG GCCATGTTAA GGGCGGTTTT TTCCTGTTTG GTCACTGATG 

3361 CCTCCGTGTA AGGGGGATTT CTGTTCATGG GGGTAATGAT ACCGATGAAA CGAGAGAGGA TGCTCACGAT ACGGGTTACT 

3441 GATGATGAAC ATGCCCGGTT ACTGGAACGT TGTGAGGGTA AACAACTGGC GGTATGGATG CGGCGGGACC AGAGAAAAAT 

3521 CACTCAGGGT CAATGCCAGC GCTTCGTTAA TACAGATGTA GGTGTTCCAC AGGGTAGCCA GCAGCATCCT GCGATGCAGA

3601 TCCGGAACAT AATGGTGCAG GGCGCTGACT TCCGCGTTTC CAGACTTTAC GAAACACGGA AACCGAAGAC CATTCATGTT 

3681 GTTGCTCAGG TCGCAGACGT TTTGCAGCAG CAGTCGCTTC ACGTTCGCTC GCGTATCGGT GATTCATTCT GCTAACCAGT

3761 AAGGCAACCC CGCCAGCCTA GCCGGGTCCT CAACGACAGG AGCACGATCA TGCGCACCCG TGGCCAGGAC CCAACGCTGC 

3841 CCGAGATGCG CCGCGTGCGG CTGCTGGAGA TGGCGGACGC GATGGATATG TTCTGCCAAG GGTTGGTTTG CGCATTCACA 

3921 GTTCTCCGCA AGAATTGATT GGCTCCAATT CTTGGAGTGG TGAATCCGTT AGCGAGGTGC CGCCGGCTTC CATTCAGGTC

4001 GAGGTGGCCC GGCTCCATGC ACCGCGACGC AACGCGGGGA GGCAGACAAG GTATAGGGCG GCGCCTACAA TCCATGCCAA 

4081 CCCGTTCCAT GTGCTCGCCG AGGCGGCATA AATCGCCGTG ACGATCAGCG GTCCAGTGAT CGAAGTTAGG CTGGTAAGAG 

4161 CCGCGAGCGA TCCTTGAAGC TGTCCCTGAT GGTCGTCATC TACCTGCCTG GACAGCATGG CCTGCAACGC GGGCATCCCG 

4241 ATGCCGCCGG AAGCGAGAAG AATCATAATG GGGAAGGCCA TCCAGCCTCG CGTCGCGAAC GCCAGCAAGA CGTAGCCCAG 

4321 CGCGTCGGCC GCCATGCCGG CGATAATGGC CTGCTTCTCG CCGAAACGTT TGGTGGCGGG ACCAGTGACG AAGGCTTGAG 

4401 CGAGGGCGTG CAAGATTCCG AATACCGCAA GCGACAGGCC GATCATCGTC GCGCTCCAGC GAAAGCGGTC CTCGCCGAAA 

4481 ATGACCCAGA GCGCTGCCGG CACCTGTCCT ACGAGTTGCA TGATAAAGAA GACAGTCATA AGTGCGGCGA CGATAGTCAT 

4561 GCCCCGCGCC CACCGGAAGG AGCTGACTGG GTTGAAGGCT CTCAAGGGCA TCGGTCGATC GACGCTCTCC CTTATGCGAC 

4641 TCCTGCATTA GGAAGCAGCC CAGTAGTAGG TTGAGGCCGT TGAGCACCGC CGCCGCAAGG AATGGTGCAT GCAAGGAGAT 

4721 GGCGCCCAAC AGTCCCCCGG CCACGGGGCC TGCCACCATA CCCACGCCGA AACAAGCGCT CATGAGCCCG AAGTGGCGAG 

4801 CCCGATCTTC CCCATCGGTG ATGTCGGCGA TATAGGCGCC AGCAACCGCA CCTGTGGCGC CGGTGATGCC GGCCACGATG 

4881 CGTCCGGCGT AGAGGATCG

FIG. 12(contd.)
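FIG. 10 places the IPAAA44548-6HIS gene at positions 109 to 360 of this construct. The following Python sketch assumes the standard genetic code and uses the first 400 bases as I read them from the listing above (with G at the one position where a scan may show a 6). It translates that interval and compares the result with the 78-letter INSP037 query shown in full in FIG. 16B. The helper names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Rows 1, 81, 161, 241 and 321 of FIG. 12 (position numbers stripped).
ROWS = [
    "AGATCTCGAT CCCGCGAAAT TAATACGACT CACTATAGGG AGACCACAAC GGTTTCCCTC TAGATCACAA GTTTGTACAA",
    "AAAAGCAGGC TTCGAAGGAG ATATACATAT GACTTCACCA AACGAACTAA ATAAGCTGCC ATGGACCAAT CCTGGAGAAA",
    "CAGAGATATG TGACCTTTCA GACACAGAAT TCAAAATATC TGTGTTGAAG AACCTCAAGG AAATTCAAGA TAACACAGAG",
    "AAGGAATCCA GAATTCTATC AGACAAATAT AAGAAACAGA TTGAAATAAT TAAAGGGAAT CAAGCAGAAA TTCTGGAGTT",
    "GAGAAATGCA GATGGCACAC TTCACCATCA CCATCACCAT TGAAACCCAG CTTTCTTGTA CAAAGTGGTG ATGATCCGGC",
]
SEQ = "".join(ROWS).replace(" ", "")

QUERY = ("MTSPNELNKLPWTNPGETEICDLSDTEFKISVLKNLKEIQDNTEKESRILSDKYKKQIEI"
         "IKGNQAEILELRNADGTL")


def translate(dna: str) -> str:
    """Translate complete codons of dna, reading from its first base."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


cds = SEQ[109 - 1:360]          # 1-based inclusive interval 109..360
protein = translate(cds)
print(protein)
print(protein == QUERY + "HHHHHH")
print(CODON_TABLE[SEQ[360:363]])  # codon immediately after the annotated gene
```

Under these assumptions the interval reads as the query sequence followed by six histidines, and the next codon (TGA) is a stop.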



FIG. 13 pEAK12D-IPAAA44548-6HIS

1 GGCGTAATCT GCTGCTTGCA AACAAAAAAA CCACCGCTAC CAGCGGTGGT TTGTTTGCCG GATCAAGAGC TACCAACTCT 
81 TTTTCCGAAG GTAACTGGCT TCAGCAGAGC GCAGATACCA AATACTGTCC TTCTAGTGTA GCCGTAGTTA GGCCACCACT 
161 TCAAGAACTC TGTAGCACCG CCTACATACC TCGCTCTGCT GAAGCCAGTT ACCAGTGGCT GCTGCCAGTG GCGATAAGTC 
241 GTGTCTTACC GGGTTGGACT CAAGAGATAG TTACCGGATA AGGCGCAGCG GTCGGGCTGA ACGGGGGGTT CGTGCACACA 
321 GCCCAGCTTG GAGCGAACGA CCTACACCGA ACTGAGATAC CTACAGCGTG AGCTATGAGA AAGCGCCACG CTTCCCGAAG 
401 GGAGAAAGGC GGACAGGTAT CCGGTAAGCG GCAGGGTCGG AACAGGAGAG CGCACGAGGG AGCTTCCAGG GGGAAACGCC 
481 TGGTATCTTT ATAGTCCTGT CGGGTTTCGC CACCTCTGAC TTGAGCGTCG ATTTTTGTGA TGCTCGTCAG GGGGGCGGAG 
561 CCTATGGAAA AACGCCAGCA ACGCAAGCTA GAGTTTAAAC TTGACAGATG AGACAATAAC CCTGATAAAT GCTTCAATAA 
641 TATTGAAAAA GGAAAAGTAT GAGTATTCAA CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT 
721 TTTTGCTCAC CCAGAAACGC TGGTGAAAGT AAAAGATGCA GAAGATCACT TGGGTGCGCG AGTGGGTTAC ATCGAACTGG 
801 ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTC CCAATGATGA GCACTTTTAA AGTTCTGCTA 
881 TGTGGCGCGG TATTATCCCG TATTGATGCC GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA ATGACTTGGT 
961 TGAATACTCA CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA GAGAATTATG CAGTGCTGCC ATAACCATGA 
1041 GTGATAACAC TGCGGCCAAC TTACTTCTGA CAACTATCGG AGGACCGAAG GAGCTAACCG CTTTTTTGCA CAACATGGGG 
1121 GATCATGTAA CTCGCCTTGA TCGTTGGGAA CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA CCACGATGCC
1201 TGTAGCAATG GCAACAACGT TGCGAAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA CTAATAGACT 
1281 GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCACTTCCG GCTGGCTGGT TTATTGCTGA TAAATCAGGA 
1361 GCCGGTGAGC GTGGGTCACG CGGTATCATT GCAGCACTGG GGCCGGATGG TAAGCCCTCC CGTATCGTAG TTATCTACAC
1441 TACGGGGAGT CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA TAGGTGCCTC ACTGATTAAG CATTGGTAAG 
1521 GATAAATTTC TGGTAAGGAG GACACGTATG GAAGTGGGCA AGTTGGGGAA GCCGTATCCG TTGCTGAATC TGGCATATGT 
1601 GGGAGTATAA GACGCGCAGC GTCGCATCAG GCATTTTTTT CTGCGCCAAT GCAAAAAGGC CATCCGTCAG GATGGCCTTT 
1681 CGGCATAACT AGTGAGGCTC CGGTGCCCGT CAGTGGGCAG AGCGCACATC GCCCACAGTC CCCGAGAAGT TGGGGGGAGG 
1761 GGTCGGCAAT TGAACCGGTG CCTAGAGAAG GTGGCGCGGG GTAAACTGGG AAAGTGATGT CGTGTACTGG CTCCGCCTTT 
1841 TTCCCGAGGG TGGGGGAGAA CCGTATATAA GTGCAGTAGT CGCCGTGAAC GTTCTTTTTC GCAACGGGTT TGCCGCCAGA 
1921 ACACAGGTAA GTGCCGTGTG TGGTTCCCGC GGGCCTGGCC TCTTTACGGG TTATGGCCCT TGCGTGCCTT GAATTACTTC 
2001 CACCTGGCTG CAGTACGTGA TTCTTGATCC CGAGCTTCGG GTTGGAAGTG GGTGGGAGAG TTCGAGGCCT TGCGCTTAAG 
2081 GAGCCCCTTC GCCTCGTGCT TGAGTTGAGG CCTGGCCTGG GCGCTGGGGC CGCCGCGTGC GAATCTGGTG GCACCTTCGC 
2161 GCCTGTCTCG CTGCTTTCGA TAAGTCTCTA GCCATTTAAA ATTTTTGATG ACCTGCTGCG ACGCTTTTTT TCTGGCAAGA 



FIG. 13(contd.)

2241 TAGTCTTGTA AATGCGGGCC AAGACGATCT GCACACTGGT ATTTCGGTTT TTGGGGCCGC GGGCGGCGAC GGGGCCCGTG
2321 CGTCCCAGCG CACATGCATG TTCGGCGAGG CGGGGCCTGC GAGCGCGGCC ACCGAGAATC GGACGGGGGT AGTCTCAAGC
2401 TGGCCGGCCT GCTCTGGTGC CTGGCCTCGC GCCGCCGTGT ATCGCCCCGC CCTGGGCGGC AAGGCTGGGA GCTCAAAATG
2481 GAGGACGCGG CGCTCGGGAG AGCGGGCGGG TGAGTCACCC ACACAAAGGA AAAGGGCCTT TCCGTCCTCA GCCGTCGCTT
2561 CATGTGACTC CACGGAGTAC CGGGCGCCGT CCAGGCACCT CGATTAGTTC TCGAGCTTTT GGAGTACGTC GTCTTTAGGT
2641 TGGGGGGAGG GGTTTTATGC GATGGAGTTT CCCCACACTG AGTGGGTGGA GACTGAAGTT AGGCCAGCTT GGCACTTGAT
2721 GTAATTCTCC TTGGAATTTG CCCTTTTTGA GTTTGGATCT TGGTTCATTC TCAAGCCTCA GACAGTGGTT CAAATTAATA
2801 CGACTCACTA TAGGGAGACT TCTTTCTCCC ATTTCAGGTG TCGTAAGCTA TCAAACAAGT TTGTACAAAA AAGCAGGCTT
2881 CGCCACCATG ACTTCACCAA ACGAACTAAA TAAGCTGCCA TGGACCAATC CTGGAGAAAC AGAGATATGT GACCTTTCAG
2961 ACACAGAATT CAAAATATCT GTGTTGAAGA ACCTCAAGGA AATTCAAGAT AACACAGAGA AGGAATCCAG AATTCTATCA
3041 GACAAATATA AGAAACAGAT TGAAATAATT AAAGGGAATC AAGCAGAAAT TCTGGAGTTG AGAAATGCAG ATGGCACACT
3121 TCACCATCAC CATCACCATT GAAACCCAGC TTTCTTGTAC AAAGTGGTTC GATGGCCGCA GGTAAGCCAG CCCAGGCCTC
3201 GCCCTCCAGC TCAAGGCGGG ACAGGTGCCC TAGAGTAGCC TGCATCCAGG GACAGGCCCC AGCCGGGTGC TGACACGTCC
3281 ACCTCCATCT CTTCCTCAGG TCTGCCCGGG TGGCATCCCT GTGACCCCTC CCCAGTGCCT CTCCTGGTCG TGGAAGGTGC
3361 TACTCCAGTG CCCACCAGCC TTGTCCTAAT AAAATTAAGT TGCATCATTT TGTTTGACTA GGTGTCCTTG TATAATATTA
3441 TGGGGTGGAG GCGGGTGGTA TGGAGCAAGG GGCCCAAGTT AACTTGTTTA TTGCAGCTTA TAATGGTTAC AAATAAAGCA
3521 ATAGCATCAC AAATTTCACA AATAAAGCAT TTTTTTCACT GCATTCTAGT TGTGGTTTGT CCAAACTCAT CAATGTATCT
3601 TATCATGTCT GGATCCGCTT CAGGCACCGG GCTTGCGGGT CATGCACCAG GTGCGCGGTC CTTCGGGCAC CTCGACGTCG
3681 GCGGTGACGG TGAAGCCGAG CCGCTCGTAG AAGGGGAGGT TGCGGGGCGC GGAGGTCTCC AGGAAGGCGG GCACCCCGGC
3761 GCGCTCGGCC GCCTCCACTC CGGGGAGCAC GACGGCGCTG CCCAGACCCT TGCCCTGGTG GTCGGGCGAG ACGCCGACGG
3841 TGGCCAGGAA CCACGCGGGC TCCTTGGGCC GGTGCGGCGC CAGGAGGCCT TCCATCTGTT GCTGCGCGGC CAGCCTGGAA
3921 CCGCTCAACT CGGCCATGCG CGGGCCGATC TCGGCGAACA CCGCCCCCGC TTCGACGCTC TCCGGCGTGG TCCAGACCGC
4001 CACCGCGGCG CCGTCGTCCG CGACCCACAC CTTGCCGATG TCGAGCCCGA CGCGCGTGAG GAAGAGTTCT TGCAGCTCGG
4081 TGACCCGCTC GATGTGGCGG TCCGGGTCGA CGGTGTGGCG CGTGGCGGGG TAGTCGGCGA ACGCGGCGGC GAGGGTGCGT
4161 ACGGCCCGGG GGACGTCGTC GCGGGTGGCG AGGCGCACCG TGGGCTTGTA CTCGGTCATG GTGGCCTGCA GAGTCGCTCT
4241 GTGTTCGAGG CCACACGCGT CACCTTAATA TGCGAAGTGG ACCTGGGACC GCGCCGCCCC GACTGCATCT GCGTGTTTTC
4321 GCCAATGACA AGACGCTGGG CGGGGTTTGT GTCATCATAG AACTAAAGAC ATGCAAATAT ATTTCTTCCG GGGACACCGC
4401 CAGCAAACGC GAGCAACGGG CCACGGGGAT GAAGCAGCTG CGCCACTCCC TGAAGATCCC CCTTATTAAC CCTAAACGGG
4481 TAGCATATGC TTCCCGGGTA GTAGTATATA CTATCCAGAC TAACCCTAAT TCAATAGCAT ATGTTACCCA ACGGGAAGCA
4561 TATGCTATCG AATTAGGGTT AGTAAAAGGG TCCTAAGGAA CAGCGATCTG GATAGCATAT GCTATCCTAA TCTATATCTG
4641 GGTAGCATAT GCTATCCTAA TCTATATCTG GGTAGCATAG GCTATCCTAA TCTATATCTG GGTAGCATAT GCTATCCTAA
4721 TCTATATCTG GGTAGTATAT GCTATCCTAA TTTATATCTG GGTAGCATAG GCTATCCTAA TCTATATCTG GGTAGCATAT



4801 GCTATCCTAA TCTATATCTG GGTAGTATAT GCTATCCTAA TCTGTATCCG GGTAGCATAT GCTATCCTCA TGCATATACA 

4881 GTCAGCATAT GATACCCAGT AGTAGAGTGG GAGTGCTATC CTTTGCATAT GCCGCCACCT CCCAAGGAGA TCCGCATGTC 

4961 TGATTGCTCA CCAGGTAAAT GTCGCTAATG TTTTCCAACG CGAGAAGGTG TTGAGCGCGG AGCTGAGTGA CGTGACAACA 

5041 TGGGTATGCC CAATTGCCCC ATGTTGGGAG GACGAAAATG GTGACAAGAC AGATGGCCAG AAATACACCA ACAGCACGCA 

5121 TGATGTCTAC TGGGGATTTA TTCTTTAGTG CGGGGGAATA CACGGCTTTT AATACGATTG AGGGCGTCTC CTAACAAGTT 

5201 ACATCACTCC TGCCCTTCCT CACCCTCATC TCCATCACCT CCTTCATCTC CGTCATCTCC GTCATCACCC TCCGCGGCAG 

5281 CCCCTTCCAC CATAGGTGGA AACCAGGGAG GCAAATCTAC TCCATCGTCA AAGCTGCACA CAGTCACCCT GATATTGCAG 

5361 GTAGGAGCGG GCTTTGTCAT AACAAGGTCC TTAATCGCAT CCTTCAAAAC CTCAGCAAAT ATATGAGTTT GTAAAAAGAC 

5441 CATGAAATAA CAGACAATGG ACTCCCTTAG CGGGCCAGGT TGTGGGCCGG GTCCAGGGGC CATTCCAAAG GGGAGACGAC 

5521 TCAATGGTGT AAGACGACAT TGTGGAATAG CAAGGGCAGT TCCTCGCCTT AGGTTGTAAA GGGAGGTCTT ACTACCTCCA

5601 TATACGAACA CACCGGCGAC CCAAGTTCCT TCGTCGGTAG TCCTTTCTAC GTGACTCCTA GCCAGGAGAG CTCTTAAACC 

5681 TTCTGCAATG TTCTCAAATT TCGGGTTGGA ACCTCCTTGA CCACGATGCT TTCCAAACCA CCCTCCTTTT TTGCGCCTGC 

5761 CTCCATCACC CTGACCCCGG GGTCCAGTGC TTGGGCCTTC TCCTGGGTCA TCTGCGGGGC CCTGCTCTAT CGCTCCCGGG 

5841 GGCACGTCAG GCTCACCATC TGGGCCACCT TCTTGGTGGT ATTCAAAATA ATCGGCTTCC CCTACAGGGT GGAAAAATGG 

5921 CCTTCTACCT GGAGGGGGCC TGCGCGGTGG AGACCCGGAT GATGATGACT GACTACTGGG ACTCCTGGGC CTCTTTTCTC 

6001 CACGTCCACG ACCTCTCCCC CTGGCTCTTT CACGACTTCC CCCCCTGGCT CTTTCACGTC CTCTACCCCG GCGGCCTCCA 

6081 CTACCTCCTC GACCCCGGCC TCCACTACCT CCTCGACCCC GGCCTCCACT GCCTCCTCGA CCCCGGCCTC CGGCACCTCC

6161 TCCAGCCCCA GCACCTCCAC CAGCCCCAGC TCCCCCAGCT CCAGCCCCAC CAGCACCAGC CCCTCCAGCC CCACCAGCCC 

6241 CAGCCCCTCC GGCACCTCCT CCAGCCCCAG CACCTCCACC AGCCCCAGCT CCCCCAGCTC CAGCCCCAGC AGCACCAGCC 

6321 CCTCCAGCCC CACCAGCCCC AGCCCCTCCT GTTCCACCGT GGGTCCCTTT GCAGCCAATG CAACTTGGAC GTTTTTGGGG 

6401 TCTCCGGACA CCATCTCTAT GTCTTGGCCC TGATCCTGAG CCGCCCGGGG CTCCTGGTCT TCCGCCTCCT CGTCCTCGTC 

6481 CTCTTCCCCG TCCTCGTCCA TGGTTATCAC CCCCTCTTCT TTGAGGTCCA CTGCCGCCGG AGCCTTCTGG TCCAGATGTG

6561 TCTCCCTTCT CTCCTAGGCC ATTTCCAGGT CCTGTACCTG GCCCCTCGTC AGACATGATT CACACTAAAA GAGATCAATA 

6641 GACATCTTTA TTAGACGACG CTCAGTGAAT ACAGGGAGTG CAGACTCCTG CCCCCTCCAA CAGCCCCCCC ACCCTCATCC 

6721 CCTTCATGGT CGCTGTCAGA CAGATCCAGG TCTGAAAATT CCCCATCCTC CGAACCATCC TCGTCCTCAT CACCAATTAC 

6801 TCGCAGCCCG GAAAACTCCC GCTGAACATC CTCAAGATTT GCGTCCTGAG CCTCAAGCCA GGCCTCAAAT TCCTCGTCCC 

6881 CCTTTTTGCT GGACGGTAGG GATGGGGATT CTCGGGACCC CTCCTCTTCC TCTTCAAGGT CACCAGACAG AGATGCTACT 

6961 GGGGCAACGG AAGAAAAGCT GGGTGCGGCC TGTGAAGCTA AGATCTGTCG ACATCGATGG GCGCGGGTGT ACACTCCGCC 

7041 CATCCCGCCC CTAACTCCGC CCAGTTCCGC CCATTCTCCG CCTCATGGCT GACTAATTTT TTTTATTTAT GCAGAGGCCG 

7121 AGGCCGCCTC GGCCTCTGAG CTATTCCAGA AGTAGTGAGG AGGCTTTTTT GGAGGCCTAG GCTTTTGCAA AAAGCTAATT 

FIG. 13(contd.)



FIG. 14

BLASTP v NCBI nr

Query= INSP037.pep
         (78 letters)

Database: All non-redundant GenBank CDS
translations+PDB+SwissProt+PIR+PRF
           1,446,218 sequences; 465,230,387 total letters

Searching...done

                                                                    Score    E
Sequences producing significant alignments:                         (bits)  Value

ref|XP_211857.1| hypothetical protein XP_211857 [Homo sapiens]        109   8e-24
ref|XP_112161.2| similar to putative RNA binding protein 1 [Ratt...    38   0.041
ref|XP_220945.1| similar to keratin 21, type I, cytoskeletal - r...    37   0.069
ref|NP_775151.1| cytokeratin 21 [Rattus norvegicus] >gi|125089|s...    37   0.069
gb|AAD49229.2|AF159462_1 EHEC factor for adherence [Escherichia ...    35   0.26
gb|AAL57562.1|AF453441_46 Efa1 [Escherichia coli]                      35   0.26
emb|CAB55629.1| lymphostatin [Escherichia coli]                        35   0.26
emb|CAC81883.1| Efa1-LifA-Tox protein [Escherichia coli]               35   0.26
gb|AAA39399.1| ORF1                                                    35   0.34
pir||T36223 hypothetical protein SCE39.13C - Streptomyces coelic...    34   0.59

>ref|XP_211857.1| hypothetical protein XP_211857 [Homo sapiens]
          Length = 113

 Score = 109 bits (273), Expect = 8e-24
 Identities = 54/74 (72%), Positives = 63/74 (84%)



Query: 1 MTSPNELNKLPWTNPGETEICDLSDTEFKISVLKNLKEIQDNTEKESRILSDKYKKQIEI 60 

MTSPNELN+ P TNP ETEIC++ D EFKI+VL+ L EIQDNTEKE ++LSDK K+IEI 
Sbjct: 1 MTSPNELNEAPGTNPAETEICNILDREFKIAVLRKLNEIQDNTEKELKVLSDKIIKEIEI 60 



Query: 61 IKGNQAEILELRNA 74 

IK NQAEILEL+NA 
Sbjct: 61 IKMNQAEILELKNA 74 
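The identity count reported for this alignment can be reproduced directly from the two aligned strings. This is a small Python sketch; it counts exact matches only and does not attempt the "Positives" figure, which depends on the substitution matrix used for the search (not stated in the figure). The function name is my own.

```python
# Aligned Query and Sbjct strings from FIG. 14 (both segments joined).
query = ("MTSPNELNKLPWTNPGETEICDLSDTEFKISVLKNLKEIQDNTEKESRILSDKYKKQIEI"
         "IKGNQAEILELRNA")
sbjct = ("MTSPNELNEAPGTNPAETEICNILDREFKIAVLRKLNEIQDNTEKELKVLSDKIIKEIEI"
         "IKMNQAEILELKNA")


def identities(a: str, b: str) -> int:
    """Count columns where both aligned strings carry the same residue."""
    if len(a) != len(b):
        raise ValueError("aligned strings must have equal length")
    return sum(x == y and x != "-" for x, y in zip(a, b))


n = identities(query, sbjct)
# Integer (truncated) percentage, which matches the value printed in the figure.
print(f"{n}/{len(query)} ({100 * n // len(query)}%)")  # 54/74 (72%)
```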



FIG. 15

BLAST v month-aa

Query= INSP037.pep
         (78 letters)

Database: NCBI: Rolling month (30 days) of new/revised protein
sequences
           37,755 sequences; 14,558,446 total letters

Searching...done

                                                                    Score    E
Sequences producing significant alignments:                         (bits)  Value

ref|XP_141262.1| similar to NAG14 protein [Homo sapiens] [Mus mu...    30   0.27
ref|NP_831679.1| Phage-related protein [Bacteriophage phBC6A51]...     30   0.36
ref|NP_083191.1| RIKEN cDNA 1200008A14 [Mus musculus] >gi|128359...    29   0.61
ref|NP_852012.1| neck appendage [Streptococcus phage C1] >gi|309...    28   0.80
ref|NP_064648.1| neurexin I; neurexin I beta; neurexin I alpha; ...    28   1.0
ref|XP_319358.1| ENSANGP00000006161 [Anopheles gambiae] >gi|2130...    28   1.0
ref|XP_308412.1| ENSANGP00000019827 [Anopheles gambiae] >gi|2129...    28   1.0
ref|NP_196806.2| expressed protein [Arabidopsis thaliana]              27   1.8
gb|AAL29689.1| Snf2-related chromatin remodeling factor SRCAP [T...    27   1.8
ref|XP_314825.1| ENSANGP00000011098 [Anopheles gambiae] >gi|2129...    27   1.8
ref|XP_311503.1| ENSANGP00000013657 [Anopheles gambiae] >gi|2129...    27   2.3

>ref|XP_141262.1| similar to NAG14 protein [Homo sapiens] [Mus musculus]
 ref|XP_230311.1| similar to NAG14 protein [Homo sapiens] [Rattus norvegicus]
 ref|NP_848840.1| RIKEN cDNA 6430556C10 gene [Mus musculus]
 dbj|BAC28656.1| unnamed protein product [Mus musculus]
 dbj|BAC33302.1| unnamed protein product [Mus musculus]
          Length = 640

 Score = 30.0 bits (66), Expect = 0.27
 Identities = 22/59 (37%), Positives = 33/59 (55%), Gaps = 8/59 (13%)

Query: 20  ICDLSDTEFK-ISVLKNLKEIQDNTEKESRILSDKYKKQIEIIKGN------QAEILEL 71
           +C  S+   K I V KNL+E+ D     +R+L + ++ QI+IIK N        EIL+L
Sbjct: 50  VCSCSNQFSKVICVRKNLREVPDGISTNTRLL-NLHENQIQIIKVNSFKHLRHLEILQL 107



FIG. 16A

TBLASTN v NCBI nt-month

Query= INSP037.pep
         (78 letters)

Database: NCBI: Rolling month (30 days) of new/revised nt sequences
(GenBank+EMBL+DDBJ sequences (but no EST, STS, GSS, or phase 0, 1
or 2 HTGS sequences))
           44,426 sequences; 216,324,491 total letters

Searching...done

                                                                    Score    E
Sequences producing significant alignments:                         (bits)  Value

gb|AC093724.3| Homo sapiens BAC clone RP11-1L5 from 2, complete ...   105   2e-23
emb|BX510371.4| Human DNA sequence from clone RP13-728A10 on chr...    89   2e-18
gb|AC144561.8| Homo sapiens 3 BAC RP11-628C23 (Roswell Park Canc...    82   4e-16
dbj|AP001827.5| Homo sapiens genomic DNA, chromosome 11 clone:RP...    80   1e-15
emb|Z97632.3|HS196E23 Human DNA sequence from clone RP1-196E23 o...    66   3e-11
emb|BX322234.7| Human DNA sequence from clone XXyac-65C7_A on ch...    62   5e-10
dbj|AP005138.3| Homo sapiens genomic DNA, chromosome 18 clone:RP...    54   1e-07
dbj|AP006292.2| Homo sapiens genomic DNA, chromosome 9 clone:RP1...    54   1e-07
gb|AC083903.10| Homo sapiens chromosome UNK clone RP11-785G23, c...    47   1e-05
gb|AY293855.1| Homo sapiens insulin-like growth factor 2 recepto...    45   7e-05

>gb|AC093724.3| Homo sapiens BAC clone RP11-1L5 from 2, complete sequence
          Length = 161617

 Score = 105 bits (263), Expect = 2e-23
 Identities = 55/78 (70%), Positives = 62/78 (78%)
 Frame = -3

Query: 1     MTSPNELNKLPWTNPGETEICDLSDTEFKISVLKNLKEIQDNTEKESRILSDKYKKQIEI 60
             MTSPNELNK P  NP ET++CDLS  EFKI+VL+ LKEIQDNTEK  RILSDK+ K IEI
Sbjct: 22538 MTSPNELNKAPRINPQETKLCDLSHGEFKIAVLRKLKEIQDNTEKGFRILSDKFNKDIEI 22359

Query: 61    IKGNQAEILELRNADGTL 78
             I   +AEILEL+NA G L
Sbjct: 22358 IFKTRAEILELKNAIGIL 22305

 Score = 30.0 bits (66), Expect = 1.7
 Identities = 19/60 (31%), Positives = 35/60 (57%)
 Frame = +3

Query: 14     NPGETEICDLSDTEFKISVLKNLKEIQDNTEKESRILSDKYKKQIEIIKGNQAEILELRN 73
              +P + EI DLS+ EFK+ V+K ++E  +  E + +    K +K I+ +KG   + ++  N
Sbjct: 111237 DPNKEEITDLSBKEFKL-VIKLIREGPEKGEAQCK----KIQKVIQ*VKGETFKEIDSLN 111401



FIG. 16B

TBLASTN v NCBI nt

Query= INSP037.pep
         (78 letters)

Database: All GenBank+EMBL+DDBJ+PDB sequences (but no EST, STS, GSS,
or phase 0, 1 or 2 HTGS sequences)
           1,794,754 sequences; 8,367,844,792 total letters

Searching...done

                                                                    Score    E
Sequences producing significant alignments:                         (bits)  Value

gb|AC112641.3| Homo sapiens 3 BAC RP11-431I8 (Roswell Park Cance...   158   2e-37
gb|AC026118.17| Homo sapiens 3 BAC RP11-67F24 (Roswell Park Canc...   158   2e-37
emb|AL020989.2|HS192P9 Human DNA sequence from clone RP1-192P9 o...   117   3e-25
gb|AC009811.14| Homo sapiens chromosome 3, clone RP11-491K7, com...   116   7e-25
gb|AC108166.5| Homo sapiens BAC clone RP11-724L20 from 4, comple...   115   9e-25
gb|AC011299.3|AC011299 Homo sapiens BAC clone RP11-232C20 from 7,...  115   1e-24
gb|AC144613.1| Pan troglodytes chromosome 7 clone RP43-1F6, comp...   115   1e-24
dbj|AP001992.4| Homo sapiens genomic DNA, chromosome 11q clone:R...   115   1e-24
emb|AL359393.9| Human DNA sequence from clone RP11-338I3 on chro...   114   2e-24
emb|AL353577.22| Human DNA sequence from clone RP11-661K19 on ch...   114   2e-24

>gb|AC112641.3| Homo sapiens 3 BAC RP11-431I8 (Roswell Park Cancer Institute Human BAC
Library) complete sequence
          Length = 165619

Score = 158 bits (399), Expect = 2e-37 

Identities = 78/78 (100%), Positives = 78/78 (100%) 

Frame = +3 

Query: 1 MTSPNELNKLPWTNPGETEICDLSDTEFKISVLKNLKEIQDNTEKESRILSDKYKKQIEI 60 

MTSPNELNKLPWTNPGETEICDLSDTEFKISVLKNLKEIQDNTEKESRILSDKYKKQIEI 
Sbjct: 47052 MTSPNELNKLPWTNPGETEICDLSDTEFKISVLKNLKEIQDNTEKESRILSDKYKKQIEI 47231 



Query: 61 IKGNQAEILELRNADGTL 78 

IKGNQAEILELRNADGTL 
Sbjct: 47232 IKGNQAEILELRNADGTL 47285 
FIG. 18C
FIG. 18D